Visit-With-Us¶
Business Context¶
"Visit with Us," a leading travel company, is revolutionizing the tourism industry by leveraging data-driven strategies to optimize operations and customer engagement. While introducing a new package offering, such as the Wellness Tourism Package, the company faces challenges in targeting the right customers efficiently. The manual approach to identifying potential customers is inconsistent, time-consuming, and prone to errors, leading to missed opportunities and suboptimal campaign performance.
To address these issues, the company aims to implement a scalable, automated system that integrates customer data, predicts potential buyers, and enhances decision-making for marketing strategies. By utilizing an MLOps pipeline, the company seeks seamless integration of data preprocessing, model development, deployment, and CI/CD practices for continuous improvement. This system will ensure efficient targeting of customers, timely updates to the predictive model, and adaptation to evolving customer behaviors, ultimately driving growth and customer satisfaction.
Objective¶
As an MLOps Engineer at "Visit with Us," your responsibility is to design and deploy an MLOps pipeline on GitHub to automate the end-to-end workflow for predicting customer purchases. The primary objective is to build a model that predicts whether a customer will purchase the newly introduced Wellness Tourism Package before contacting them. The pipeline will include data cleaning, preprocessing, transformation, model building, training, evaluation, and deployment, ensuring consistent performance and scalability. By leveraging GitHub Actions for CI/CD integration, the system will enable automated updates, streamline model deployment, and improve operational efficiency. This robust predictive solution will empower decision-makers to make data-driven choices, enhance marketing strategies, and effectively target potential customers, thereby driving customer acquisition and business growth.
Data Description¶
The dataset contains customer and interaction data that serve as key attributes for predicting the likelihood of purchasing the Wellness Tourism Package. The detailed attributes are:
Customer Details
- CustomerID: Unique identifier for each customer.
- ProdTaken: Target variable indicating whether the customer has purchased a package (0: No, 1: Yes).
- Age: Age of the customer.
- TypeofContact: The method by which the customer was contacted (Company Invited or Self Inquiry).
- CityTier: The city category based on development, population, and living standards (Tier 1 > Tier 2 > Tier 3).
- Occupation: Customer's occupation (e.g., Salaried, Freelancer).
- Gender: Gender of the customer (Male, Female).
- NumberOfPersonVisiting: Total number of people accompanying the customer on the trip.
- PreferredPropertyStar: Preferred hotel rating by the customer.
- MaritalStatus: Marital status of the customer (Single, Married, Divorced).
- NumberOfTrips: Average number of trips the customer takes annually.
- Passport: Whether the customer holds a valid passport (0: No, 1: Yes).
- OwnCar: Whether the customer owns a car (0: No, 1: Yes).
- NumberOfChildrenVisiting: Number of children below age 5 accompanying the customer.
- Designation: Customer's designation in their current organization.
- MonthlyIncome: Gross monthly income of the customer.
Customer Interaction Data
- PitchSatisfactionScore: Score indicating the customer's satisfaction with the sales pitch.
- ProductPitched: The type of product pitched to the customer.
- NumberOfFollowups: Total number of follow-ups by the salesperson after the sales pitch.
- DurationOfPitch: Duration of the sales pitch delivered to the customer.
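Before any modelling, it helps to confirm the dataset's shape, missing values, and the balance of the binary target `ProdTaken`. A minimal sketch of that first look, using a small in-memory sample in the same shape as `tourism.csv` (the values below are illustrative, not from the real dataset):

```python
import io
import pandas as pd

# Illustrative sample mimicking a few columns of tourism.csv.
csv_sample = io.StringIO(
    "CustomerID,Age,TypeofContact,CityTier,MonthlyIncome,ProdTaken\n"
    "200001,41,Self Inquiry,3,20993,1\n"
    "200002,49,Company Invited,1,20130,0\n"
    "200003,37,Self Inquiry,1,17090,0\n"
)
df = pd.read_csv(csv_sample)

# Checks that matter before modelling: shape, missing values,
# and the class balance of the target ProdTaken.
print(df.shape)
print(df.isna().sum().sum())
print(df['ProdTaken'].value_counts(normalize=True))
```

On the real data, `pd.read_csv('Data/tourism.csv')` would replace the in-memory sample; a skewed `ProdTaken` distribution here is what motivates stratified splitting and threshold tuning later in the pipeline.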
Project Folder Structure¶
|--> VisitWithUs-Tourism_version_1_1
|--> Master
|--> Data # STORING DATASET FILES
|--> tourism.csv
|--> test.csv
|--> train.csv
|--> Model_Dump_JOBLIB # STORING PROGRAM-GENERATED MODELS
|--> best_threshold.txt
|--> best_XGBoostingClassifier.joblib
|--> XGBoostingClassifier.joblib
|--> XGBoostingClassifier_ConfusionMatrix.png
|--> RandomForestClassifier.joblib
|--> RandomForestClassifier_ConfusionMatrix.png
|--> GradientBoostingClassifier.joblib
|--> GradientBoostingClassifier_ConfusionMatrix.png
|--> DecisionTreeClassifier.joblib
|--> DecisionTreeClassifier_ConfusionMatrix.png
|--> Deployment # STORING STREAMLIT DEPLOYMENT FILE
|--> app.py
|--> requirement.txt
|--> README.md
|--> DockerFile
|--> Visit-With-Us-Tourism-Prediction_v1_1.ipynb
|--> DataRegistration.py
|--> DataPrepration.py
|--> BuildingModels.py
|--> HostingInHuggingFace.py
|--> main.py
|--> .gitignore
|--> .env
|--> README.md
|--> mlruns
|--> models
|--> 674721534787404130
|--> .trash
|--> .github
|--> workflows
|--> pipeline.yml
INSTALLING PACKAGES¶
- huggingface_hub: to interact with Hugging Face programmatically (creating Spaces, datasets, models, Streamlit deployment)
- python-dotenv: to load credentials stored in a .env file
- datasets: to create datasets and load them from Hugging Face
- pandas: data manipulation (DataFrames)
- scikit-learn: to create ensemble models, perform the train/test split, and compute metrics
- xgboost: to create XGBoost classifier models
- seaborn & matplotlib: to create visuals
- joblib: to dump trained models
- streamlit: to build the front end
!pip install huggingface_hub
!pip install python-dotenv
!pip install datasets
!pip install pandas
!pip install scikit-learn
!pip install xgboost
!pip install seaborn
!pip install matplotlib
!pip install joblib
!pip install streamlit
!pip install mlflow
!pip install pyngrok
!pip install setuptools
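For reproducibility outside the notebook, the same dependencies can also live in a requirements file (the repo's Deployment/requirement.txt presumably serves this role for the Streamlit app). A minimal, unpinned sketch:

```text
huggingface_hub
python-dotenv
datasets
pandas
scikit-learn
xgboost
seaborn
matplotlib
joblib
streamlit
mlflow
pyngrok
setuptools
```

In practice, pinning versions (e.g. via `pip freeze`) keeps the CI/CD pipeline and the Hugging Face Space installing the exact versions the notebook was tested with.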
MOUNTING DRIVE¶
In this block, we mount Google Drive and read the Hugging Face token.
import os
from google.colab import drive
drive.mount('/content/drive/')
%cd '/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/'
base_path = os.getcwd()
print(f"Base Path {base_path}")
Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).
/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
Base Path /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
from google.colab import userdata
hf_token = userdata.get('HF_TOKEN')
!ls
BuildingModels.py main.py Data mlruns DataPrepration.py Model_Dump_JOBLIB DataRegistration.py __pycache__ Deployment README.md HostingInHuggingFace.py Visit-With-Us-Tourism-Prediction_v1_1.ipynb
1. DATA REGISTRATION¶
|--> class DataRegistration
|--> def __init__(self, base_path, hf_token=None)
* Constructor to assign the base_path and Hugging Face token
|--> def HFCreateRepo(self)
* Creates the dataset repository on Hugging Face
|--> def UploadingSourceData(self)
* Uploads the local tourism.csv file into the Hugging Face dataset
|--> def ToRunPipeline(self)
* Invokes dataset repo creation and the upload into the Hugging Face dataset
#@title Data Registration Class
%%writefile DataRegistration.py
import os
import traceback
import inspect
from huggingface_hub import HfApi, create_repo, login, hf_hub_download


class DataRegistration:
    def __init__(self, base_path, hf_token=None):
        print(f"Function Name {inspect.currentframe().f_code.co_name}")
        self.repoID = 'jpkarthikeyan/Tourism-visit-with-us-dataset'
        self.Subfolders = os.path.join(base_path, 'Data')
        self.folder_Master = base_path
        self.folder_data = os.path.join(base_path, 'Data')
        self.hf_token = hf_token
        os.makedirs(self.folder_data, exist_ok=True)
        print(f"self.Subfolders: {self.Subfolders}")
        print(f"self.folder_Master: {self.folder_Master}")
        print(f"folder_data: {self.folder_data}")
        print('-' * 50)

    def HFCreateRepo(self):
        print(f"Function Name {inspect.currentframe().f_code.co_name}")
        try:
            api = HfApi(token=self.hf_token)
            create_repo(repo_id=self.repoID,
                        private=False,
                        repo_type='dataset',
                        exist_ok=True)
            print(f"Repo {self.repoID} created")
            return True
        except Exception as ex:
            if hasattr(ex, 'response') and ex.response.status_code == 409:
                print(f"Repo {self.repoID} already exists")
                return True
            else:
                print(f"Exception {ex}")
                traceback.print_exc()
                return False
        finally:
            print("-" * 100)

    def UploadingSourceData(self):
        print(f"Function Name {inspect.currentframe().f_code.co_name}")
        try:
            source_data_file = os.path.join(self.folder_data, 'tourism.csv')
            print(f"Source Data File {source_data_file}")
            if not os.path.exists(source_data_file):
                raise FileNotFoundError(f"File {source_data_file} not found")
            api = HfApi()
            api.upload_file(
                path_or_fileobj=source_data_file,
                path_in_repo='Master/Data/tourism.csv',
                repo_id=self.repoID,
                repo_type='dataset',
                token=self.hf_token)
            print(f"Source data tourism.csv uploaded into {self.repoID}")
            return True
        except Exception as ex:
            print(f"Exception at {inspect.currentframe().f_code.co_name} Exception: {ex}")
            traceback.print_exc()
            return False
        finally:
            print("-" * 100)

    def ToRunPipeline(self):
        print(f"Function Name {inspect.currentframe().f_code.co_name}")
        if not self.HFCreateRepo():
            print('Exception in data registration HFCreateRepo')
            return False
        else:
            print('-' * 50)
        if not self.UploadingSourceData():
            print('Exception in data registration UploadingSourceData')
            return False
        else:
            print('Data Registration Completed')
            print('-' * 50)
            return True
Overwriting DataRegistration.py
2. DATA PREPARATION¶
|--> class DataPrepration
|--> def __init__(self, base_path, hf_token=None)
* Constructor for initializing the base_path and Hugging Face token
|--> def LoadDatasetFromHF(self)
* Loads the source dataset from Hugging Face into a DataFrame
|--> def TrainTestSplit(self, df_dataset)
* Splits the source dataset into train and test DataFrames
|--> def DatasetCleaning(self,df_data)
* Removes duplicates and fills missing/NaN values
|--> def UploadIntoHF(self,df,drive_path,file_name)
* Saves the train and test DataFrames as local CSV files
* Uploads the saved CSV files into the Hugging Face dataset repo
|--> def ToRunPipeline(self)
* Invokes the above functions in sequence
* Loads the source file from the Hugging Face dataset, splits it into train and test, cleans both splits, and uploads them back into the Hugging Face dataset repo
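The cleaning rules above (deduplicate on CustomerID, median-fill numeric columns, mode-fill categorical columns, normalize the 'Fe Male' label) can be sketched on a toy frame. The helper below is illustrative only, not part of the pipeline; the column names mirror the tourism dataset.

```python
import pandas as pd

def clean(df: pd.DataFrame) -> pd.DataFrame:
    # Normalize the known label typo, then drop duplicate customers
    df = df.copy()
    df['Gender'] = df['Gender'].replace('Fe Male', 'Female')
    df = df.drop_duplicates(subset=['CustomerID'], keep='first').reset_index(drop=True)
    for col in df.columns:
        if df[col].dtype in ['int64', 'float64']:
            df[col] = df[col].fillna(df[col].median())   # numeric -> median
        else:
            df[col] = df[col].fillna(df[col].mode()[0])  # categorical -> mode
    # The identifier column carries no predictive signal, so it is dropped
    return df.drop(columns=['CustomerID'])

toy = pd.DataFrame({
    'CustomerID': [1, 1, 2, 3],
    'Age': [30.0, 30.0, None, 40.0],
    'Gender': ['Fe Male', 'Fe Male', 'Male', None],
})
cleaned = clean(toy)
print(cleaned)  # 3 rows: Age 30/35/40, Gender Female/Male/Female
```

The duplicate CustomerID row is dropped first, so the median and mode are computed on deduplicated data, matching the order used in DatasetCleaning.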
#@title DataPrepration.py
%%writefile DataPrepration.py
import os
import pandas as pd
import inspect
import traceback
from datasets import load_dataset
from sklearn.model_selection import train_test_split
from huggingface_hub import HfApi, create_repo, login, hf_hub_download
class DataPrepration:
def __init__(self,base_path, hf_token=None):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
self.repoID = 'jpkarthikeyan/Tourism-visit-with-us-dataset'
self.Subfolders = os.path.join(base_path, 'Data')
self.hf_token = hf_token
print(f'self.repoID: {self.repoID}')
print(f'self.Subfolders: {self.Subfolders}')
print('-'*50)
def LoadDatasetFromHF(self):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
try:
df_dataset = pd.read_csv(hf_hub_download(
repo_id = self.repoID,
filename = 'Master/Data/tourism.csv',
repo_type='dataset'
))
print(f'Shape of the original dataset {df_dataset.shape}')
if 'Unnamed: 0' in df_dataset.columns:
df_dataset = df_dataset.drop(['Unnamed: 0'],axis=1)
            print(f"Dataset loaded from {self.repoID}")
print(f"Shape of the Original Dataset: {df_dataset.shape}")
return df_dataset
except Exception as ex:
print(f"Exception {ex}")
traceback.print_exc()
return None
finally:
print('-'*50)
def TrainTestSplit(self,df_dataset):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
try:
print(f"Value Count {df_dataset['ProdTaken'].value_counts()}")
df_train,df_test = train_test_split(df_dataset,
test_size=0.2,
random_state=42,
stratify=df_dataset['ProdTaken'],
shuffle=True)
print(f"Shape of the train dataset: {df_train.shape}")
print(f"Shape of the test dataset: {df_test.shape}")
return df_train, df_test
except Exception as ex:
print(f'Exception: {ex}')
            traceback.print_exc()
return None, None
finally:
print('-'*50)
def DatasetCleaning(self,df_data):
try:
print(f"Function Name {inspect.currentframe().f_code.co_name}")
df_data['Gender'] = df_data['Gender'].replace('Fe Male', 'Female')
df_data = df_data.drop_duplicates(subset=['CustomerID'], keep='first').reset_index(drop=True)
for clmn in df_data.columns:
                if df_data[clmn].dtype in ['int64', 'float64']:
#print(f"{clmn} replacing the missing value with median")
df_data[clmn] = df_data[clmn].fillna(df_data[clmn].median())
else:
#print(f"{clmn} replacing the missing value with mode")
df_data[clmn] = df_data[clmn].fillna(df_data[clmn].mode()[0])
df_data = df_data.drop(['CustomerID'], axis=1)
numerical_column = df_data.select_dtypes(include=['int64'])
for num_col in numerical_column:
Q1 = df_data[num_col].quantile(0.25)
Q3 = df_data[num_col].quantile(0.75)
IQR = Q3 - Q1
lower = Q1 - 1.5*IQR
upper = Q3 + 1.5*IQR
#df_data[num_col] = df_data[num_col].clip(lower,upper)
return df_data
except Exception as ex:
print(f"Exception {ex}")
            traceback.print_exc()
return None
finally:
print('-'*50)
def UploadIntoHF(self,df,drive_path,file_name):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
try:
file_path = os.path.join(drive_path,file_name)
df.to_csv(file_path,index=False)
api = HfApi(token = self.hf_token)
api.upload_file(path_or_fileobj =file_path,
path_in_repo= f"Master/Data/{file_name}",
repo_id = self.repoID,
repo_type='dataset',
token=self.hf_token)
print(f"Source data {file_name} uploaded into {self.repoID}")
return True
except Exception as ex:
print(f"Exception: {ex}")
traceback.print_exc()
return False
finally:
print('-'*50)
def ToRunPipeline(self):
try:
print(f"Function Name {inspect.currentframe().f_code.co_name}")
df_dataset = self.LoadDatasetFromHF()
if df_dataset is None:
return False
else:
df_train, df_test = self.TrainTestSplit(df_dataset)
if df_train is None or df_test is None:
return False
else:
df_train_cleaned = self.DatasetCleaning(df_train)
df_test_cleaned = self.DatasetCleaning(df_test)
                    if df_train_cleaned is None or df_test_cleaned is None:
return False
else:
result_train = self.UploadIntoHF(df_train_cleaned,
self.Subfolders,'train.csv')
result_test = self.UploadIntoHF(df_test_cleaned,
self.Subfolders,'test.csv')
if not result_train or not result_test:
                            print('Exception while uploading the split datasets into HF')
return False
else:
                            print('Dataset downloaded from HF, cleaned, split into train and test, and uploaded back into the HF dataset repo')
return True
except Exception as ex:
print(f"Exception message in ToRunPipeline: {ex}")
traceback.print_exc()
return False
finally:
print('-'*50)
Overwriting DataPrepration.py
3. MODEL BUILDING WITH ENSEMBLE TECHNIQUES¶
|--> class BuildingModels
|--> def __init__(self,base_path, hf_token=None)
* Constructor for initializing the basepath and hugging face token
|--> def Load_data_from_HF(self)
* Loads the train and test datasets from Hugging Face
|--> def Preprocessing_dataset(self)
* Splits the train and test datasets into features and target
|--> def Building_Models(self)
* Builds models with tree-based and ensemble techniques (decision tree, random forest, and gradient boosting), each tuned with RandomizedSearchCV
|--> def Model_Evaluation(self)
* Evaluates each model on the test set and picks the one with the highest F1 score at its best probability threshold
|--> def Register_BestModel_HF(self)
* Registers the model with the highest F1 score (and its threshold) in the Hugging Face model hub
|--> def ToRunPipeline(self)
* Invokes the above functions in sequence
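The evaluation step does not score models at a fixed 0.5 cutoff: it sweeps candidate probability thresholds and keeps the one that maximizes F1 (the class code derives the candidates from sklearn's precision_recall_curve). The same idea can be sketched with plain NumPy; the toy labels and scores below are made up for illustration.

```python
import numpy as np

def best_f1_threshold(y_true, y_prob):
    """Return (threshold, f1) maximizing F1 over the observed scores."""
    best_t, best_f1 = 0.5, -1.0
    for t in np.unique(y_prob):              # candidate thresholds = observed scores
        y_pred = (y_prob >= t).astype(int)
        tp = np.sum((y_pred == 1) & (y_true == 1))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        f1 = 2 * tp / (2 * tp + fp + fn + 1e-10)  # same epsilon guard as the class
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

y_true = np.array([0, 0, 0, 1, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.3])
t, f1 = best_f1_threshold(y_true, y_prob)
print(t, round(f1, 3))  # 0.65 0.8
```

On this toy data the F1-optimal cutoff is 0.65, above the default 0.5, which is exactly why the pipeline stores the threshold alongside the model for the deployment app.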
#@title BuildingModels.py
%%writefile BuildingModels.py
import os
import joblib
import inspect
import traceback
import mlflow
import mlflow.sklearn
import mlflow.xgboost
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from xgboost import XGBClassifier
from datasets import load_dataset
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.tree import DecisionTreeClassifier
from sklearn.impute import SimpleImputer
from huggingface_hub.utils import RepositoryNotFoundError
from huggingface_hub import HfApi, create_repo, login
from huggingface_hub import hf_hub_download
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import KFold, RandomizedSearchCV
from sklearn.metrics import precision_recall_curve, precision_score
from sklearn.ensemble import RandomForestClassifier, BaggingClassifier
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import recall_score, f1_score, classification_report
from sklearn.preprocessing import StandardScaler, OneHotEncoder
class BuildingModels:
def __init__(self,base_path, hf_token=None):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
print(f"Base Path: {base_path}")
self.models = {}
self.best_model = None
self.best_score = 0
self.best_f1_score =0.0
self.best_model_threshold = 0.0
self.best_model_name=None
self.df_train = pd.DataFrame()
self.df_test = pd.DataFrame()
self.feature_train = pd.DataFrame()
self.feature_test = pd.DataFrame()
self.target_train = pd.Series()
self.target_test = pd.Series()
self.base_path = base_path
self.Subfolders = os.path.join(base_path,'data')
self.repo_id = 'jpkarthikeyan/Tourism_Prediction_Model'
self.ds_repo_id = 'jpkarthikeyan/Tourism-visit-with-us-dataset'
self.repo_type = 'model'
self.hf_token = hf_token
mlruns_path = os.path.join(base_path,"mlruns")
print(f"ML Run path: {mlruns_path}")
os.makedirs(mlruns_path, exist_ok=True)
mlflow.set_tracking_uri(f"file://{mlruns_path}")
print(f"Tracking URI file://{mlruns_path}")
experiment = mlflow.set_experiment("Tourism-Prediction-Experiment")
print(f"Experiment ID {experiment}")
self.categorical_columns = ['TypeofContact','Occupation','Gender','ProductPitched','MaritalStatus','Designation']
self.numerical_columns = ['Age','CityTier','DurationOfPitch','NumberOfPersonVisiting',
'NumberOfFollowups','PreferredPropertyStar',
'NumberOfTrips','Passport','PitchSatisfactionScore','OwnCar',
'NumberOfChildrenVisiting','MonthlyIncome']
self.pipeline_numerical = Pipeline(steps=[
('imputer', SimpleImputer(strategy='median')),
('scaler', StandardScaler())
])
self.pipeline_onehot = Pipeline(steps=[
('imputer', SimpleImputer(strategy='most_frequent')),
('onehot', OneHotEncoder(drop='first',handle_unknown='ignore',sparse_output=False))
])
def Load_data_from_HF(self):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
try:
print(f'Loading the train dataset from {self.ds_repo_id}')
self.df_train = pd.read_csv(hf_hub_download(
repo_id = self.ds_repo_id,
filename = 'Master/Data/train.csv',repo_type='dataset'))
self.df_test = pd.read_csv(hf_hub_download(
repo_id = self.ds_repo_id,
filename = 'Master/Data/test.csv',repo_type='dataset'))
print(f"Shape of the train dataset: {self.df_train.shape}")
            print(f"Shape of the test dataset: {self.df_test.shape}")
return True
except Exception as ex:
print(f"Exception: {ex}")
traceback.print_exc()
return False
finally:
print('-'*50)
def Preprocessing_dataset(self):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
try:
self.target_train = self.df_train['ProdTaken']
self.feature_train = self.df_train.drop(['ProdTaken'],axis=1)
self.target_test = self.df_test['ProdTaken']
self.feature_test = self.df_test.drop(['ProdTaken'],axis=1)
return True
except Exception as ex:
print(f"Exception: {ex}")
traceback.print_exc()
return False
finally:
print('-'*50)
def Building_Models(self):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
try:
preprocessor = ColumnTransformer(
transformers=[
('num', self.pipeline_numerical,self.numerical_columns),
('onehot', OneHotEncoder(drop='first',handle_unknown='ignore',
sparse_output=False),self.categorical_columns)])
models_params = {
'DecisionTreeClassifier':{
'model': DecisionTreeClassifier(class_weight='balanced',random_state=42),
'params': {'classifier__criterion':['gini','entropy'],
'classifier__splitter':['best','random'],
'classifier__max_depth':[1],
'classifier__min_samples_leaf':[1,2,4],
'classifier__min_samples_split':[2,5,10],
'classifier__max_features':['sqrt','log2',None]}
},
'RandomForestClassifier':{
'model': RandomForestClassifier(class_weight='balanced',random_state=42),
'params': { 'classifier__n_estimators':[25,50,75,100],
'classifier__criterion':['gini','entropy'],
'classifier__max_depth':[5,10,15],
'classifier__min_samples_split':[15,20,25],
'classifier__min_samples_leaf':[7,10,15],
'classifier__max_features':[0.3,0.5,0.6],
'classifier__oob_score':[True],
'classifier__bootstrap':[True]
}
},
'GradientBoostingClassifier':{
'model': GradientBoostingClassifier(random_state=42),
'params':{
'classifier__n_estimators':[50,75,100,125],
'classifier__learning_rate':[0.01,0.5,0.1],
'classifier__criterion':['friedman_mse','squared_error'],
'classifier__max_features':['sqrt','log2'],
'classifier__min_samples_leaf':[1,2,4],
'classifier__subsample':[0.6,0.7,0.8],
'classifier__max_depth':[2,3,4,5]
}
}
}
cv_KFold = KFold(n_splits=3,random_state=42,shuffle=True)
for model_name, mdl_params in models_params.items():
print(f'Model {model_name} started')
with mlflow.start_run(run_name=model_name):
pipeline = Pipeline(steps=[
('preprocessor',preprocessor),
('classifier',mdl_params['model'])
])
random_search = RandomizedSearchCV(pipeline,mdl_params['params'],
n_iter=50,cv=cv_KFold,scoring='f1',
random_state=42,n_jobs=-1,verbose=2)
random_search.fit(self.feature_train,self.target_train)
self.models[model_name] = {
'model':random_search.best_estimator_,
'best_score': random_search.best_score_,
'best_params':random_search.best_params_
}
model_dir = os.path.join(self.base_path,'Model_Dump_JOBLIB')
os.makedirs(model_dir,exist_ok=True)
joblib.dump(random_search.best_estimator_,f'{self.base_path}/Model_Dump_JOBLIB/{model_name}.joblib')
abs_path = os.path.join(self.base_path,'Model_Dump_JOBLIB',f'{model_name}.joblib')
print(f'Model path: {abs_path}')
rel_path = f'Model_Dump_JOBLIB/{model_name}.joblib'
mlflow.log_params(random_search.best_params_)
mlflow.log_metric('best_score',random_search.best_score_)
mlflow.log_artifact(abs_path,artifact_path='models')
print(f'model:{random_search.best_estimator_}')
print(f'best_score: {random_search.best_score_}')
print(f'best_params: {random_search.best_params_}')
                    print(f'Model {model_name} completed')
print('-'*50)
return self.models
except Exception as ex:
print(f"Exception: {ex}")
            traceback.print_exc()
finally:
print('-'*50)
def Model_Evaluation(self):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
df_metrics = pd.DataFrame()
try:
model_dir = os.path.join(self.base_path,'Model_Dump_JOBLIB')
os.makedirs(model_dir,exist_ok=True)
for mdl_name, mdl_info in self.models.items():
with mlflow.start_run(run_name=f"{mdl_name}_eval"):
model = mdl_info['model']
                    predict_probability = model.predict_proba(self.feature_test)
                    print(f"Predict probability shape {mdl_name} {predict_probability.shape}")
                    if predict_probability.shape[1] == 1:
                        predict_probability = predict_probability.flatten()
                    else:
                        predict_probability = predict_probability[:,1]
                    prc_precision, prc_recall, prc_threshold = precision_recall_curve(self.target_test, predict_probability)
                    prc_f1score = 2*((prc_precision*prc_recall) / (prc_precision+prc_recall+1e-10))
                    prc_threshold_idmx = np.argmax(prc_f1score)
                    prc_best_threshold = prc_threshold[prc_threshold_idmx]
                    print(f'best threshold: {prc_best_threshold}')
                    predic_prob_threshold = (predict_probability >= prc_best_threshold).astype(int)
                    #predic_prob_threshold = (predict_probability >= 0.5).astype(int)
accuracy = accuracy_score(self.target_test,predic_prob_threshold)
precision = precision_score(self.target_test,predic_prob_threshold)
recall = recall_score(self.target_test,predic_prob_threshold)
f1score = f1_score(self.target_test,predic_prob_threshold)
class_report = classification_report(self.target_test,predic_prob_threshold)
conf_matrix = confusion_matrix(self.target_test,predic_prob_threshold)
lbl = ['TN', 'FP', 'FN', 'TP']
cnf_lbl = ['\n{0:0.0f}'.format(cnf_val) for cnf_val in conf_matrix.flatten()]
cn_percentage = ["\n{0:.2%}".format(item/conf_matrix.flatten().sum()) for item in conf_matrix.flatten()]
confusion_label = np.asarray([["\n {0:0.0f}".format(item)+"\n{0:.2%}".format(item/conf_matrix.flatten().sum())]
for item in conf_matrix.flatten()]).reshape(2,2)
cnf_label = np.asarray([f'{lbl1} {lbl2} {lbl3}' for lbl1, lbl2, lbl3 in zip(lbl, cnf_lbl, cn_percentage)]).reshape(2,2)
plt.figure(figsize = (3,3))
sns.heatmap(conf_matrix, annot = cnf_label, cmap = 'Spectral', fmt='' )
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title(f'{mdl_name} confusion matrix')
                    plt.tight_layout()
                    plot_path = os.path.join(self.base_path,'Model_Dump_JOBLIB',f'{mdl_name}_ConfusionMatrix.png')
                    # Save before show(): with the inline backend, saving after show() can write a blank image
                    plt.savefig(plot_path)
                    plt.show()
                    plt.close()
mlflow.log_metric('accuracy',accuracy)
mlflow.log_metric('precision',precision)
mlflow.log_metric('recall',recall)
mlflow.log_metric('f1_score',f1score)
mlflow.log_text(class_report,f'{mdl_name}_classification_report.txt')
mlflow.log_artifact(plot_path,artifact_path='models')
df_metrics = pd.concat([df_metrics,pd.DataFrame({'model':[mdl_name],'accuracy':[accuracy],
'precision':[precision], 'recall':[recall],
'f1_score':[f1score]})],ignore_index=True)
print(df_metrics)
if f1score > self.best_f1_score:
self.best_f1_score = f1score
self.best_model_threshold = prc_best_threshold
self.best_model_name = mdl_name
best_model = self.models[self.best_model_name]['model']
if hasattr(best_model, 'feature_importances_'):
feature_importance = pd.DataFrame({
'feature':self.feature_train.columns,
'importance': best_model.feature_importances_
}).sort_values('importance',ascending=False)
print('Feature Importance:\n',feature_importance)
return df_metrics
except Exception as ex:
print(f"Exception: {ex}")
finally:
print('-'*50)
def Register_BestModel_HF(self):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
try:
best_model = self.models[self.best_model_name]['model']
joblib.dump(best_model,f'{self.base_path}/Model_Dump_JOBLIB/BestModel_{self.best_model_name}.joblib')
api = HfApi()
try:
api.repo_info(repo_id=self.repo_id,repo_type=self.repo_type)
except RepositoryNotFoundError:
api.create_repo(repo_id=self.repo_id, repo_type=self.repo_type,private=False)
print("Uploading the best model into Hugging face")
api.upload_file(path_or_fileobj = f'{self.base_path}/Model_Dump_JOBLIB/BestModel_{self.best_model_name}.joblib',
path_in_repo = f"Model_Dump_JOBLIB/BestModel_{self.best_model_name}.joblib",
repo_id=self.repo_id, repo_type=self.repo_type
)
print("Uploading the best threshold text file to HF")
with open(f'{self.base_path}/Model_Dump_JOBLIB/best_threshold.txt','w') as f:
f.write(str(self.best_model_threshold))
api.upload_file(path_or_fileobj = f"{self.base_path}/Model_Dump_JOBLIB/best_threshold.txt",
path_in_repo = f"Model_Dump_JOBLIB/best_threshold.txt",
repo_id=self.repo_id, repo_type=self.repo_type
)
with mlflow.start_run(run_name=f"Best_{self.best_model_name}"):
input_epl = self.feature_train.head(5)
mlflow.log_metric('best_f1_score',self.best_f1_score)
mlflow.log_metric('best_threshold',self.best_model_threshold)
mlflow.sklearn.log_model(sk_model=best_model,
artifact_path="BestModel",
input_example=input_epl)
mlflow.log_artifact(f'{self.base_path}/Model_Dump_JOBLIB/BestModel_{self.best_model_name}.joblib', artifact_path='models')
mlflow.log_artifact(f'{self.base_path}/Model_Dump_JOBLIB/best_threshold.txt',artifact_path='models')
return True
except Exception as ex:
print(f"Exception: {ex}")
traceback.print_exc()
return False
finally:
print('-'*50)
def ToRunPipeline(self):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
df_Metrics = pd.DataFrame()
try:
if not self.Load_data_from_HF():
return False
else:
if not self.Preprocessing_dataset():
return False
else:
Build_Model = self.Building_Models()
print(Build_Model)
if Build_Model:
df_Metrics = self.Model_Evaluation()
print(df_Metrics)
if not df_Metrics.empty and df_Metrics is not None:
if self.Register_BestModel_HF():
return True
else:
return False
else:
return False
else:
return False
except Exception as ex:
            print(f'Exception occurred {ex}')
            traceback.print_exc()
finally:
print('-'*50)
Overwriting BuildingModels.py
4. HOSTING IN HUGGING FACE STREAMLIT (FRONT END IMPLEMENTATION)¶
Streamlit deployment requirements.txt file
%%writefile Deployment/requirements.txt
pandas
numpy
scikit-learn==1.6.1
joblib
streamlit
huggingface_hub
setuptools
Overwriting Deployment/requirements.txt
Streamlit deployment Readme file
%%writefile Deployment/README.md
---
title: Visit With Us - Tourism package prediction
emoji: 🚩
colorFrom: blue
colorTo: green
sdk: docker
sdk_version: 3.9
app_file: app.py
app_type: streamlit
pinned: false
license: mit
---
This Streamlit app predicts whether a customer will purchase the tourism package.
Overwriting Deployment/README.md
Streamlit deployment Docker file
%%writefile Deployment/Dockerfile
# Use a minimal base image with Python 3.12 installed
FROM python:3.12-slim
# Set the working directory inside the container to /app
WORKDIR /app
# Copy all files from the current directory on the host to the container's /app directory
COPY . .
# Install Python dependencies listed in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
RUN mkdir -p /tmp/hf_cache && chmod -R 777 /tmp/hf_cache
ENV HF_HOME=/tmp/hf_cache
ENV HUGGINGFACE_HUB_CACHE=/tmp/hf_cache
ENV PYTHONUNBUFFERED=1
EXPOSE 7860
# Define the command to run the Streamlit app on port "7860" and make it accessible externally
CMD ["streamlit", "run", "app.py", "--server.port=7860", "--server.address=0.0.0.0", "--server.enableXsrfProtection=false"]
Overwriting Deployment/Dockerfile
|--> app.py
|--> class PredictorTourism
|--> def __init__(self)
* Constructor initializing the model repo ID and the threshold default
|--> def Load_Model(self):
* Loads the best model and threshold file from Hugging Face
|--> def Predict(self, data):
* Predicts from the user input using the loaded model and tuned threshold
|--> front end form creation to get user input
* Builds the front-end form to collect the customer details
|--> Invoke Prediction and displaying the prediction output
* Collects the form input, runs the prediction, and displays the result
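The app's decision rule compares the positive-class probability against the tuned threshold loaded from best_threshold.txt, rather than a fixed 0.5. Reduced to a standalone sketch (the probabilities and threshold below are illustrative):

```python
def decide(prob_positive: float, threshold: float) -> str:
    # Same rule as Predict in app.py: label 1 if prob >= threshold, else 0
    label = int(prob_positive >= threshold)
    return "Likely to purchase" if label == 1 else "Unlikely to purchase"

# With a tuned threshold of 0.35, a 0.40 probability is already a positive call,
# while the default 0.5 rule would have said "Unlikely".
print(decide(0.40, 0.35))  # Likely to purchase
print(decide(0.40, 0.50))  # Unlikely to purchase
```

This is why the pipeline uploads best_threshold.txt next to the model: without it the app would silently fall back to a cutoff the model was never evaluated at.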
%%writefile Deployment/app.py
import streamlit as st
import pandas as pd
import joblib
import os
import logging
from huggingface_hub import login,hf_hub_download
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
os.environ["STREAMLIT_CONFIG_DIR"] = "/tmp/.streamlit"
cache_dir = "/tmp/hf_cache"
os.environ["HF_HOME"] = cache_dir
os.environ["HUGGINGFACE_HUB_CACHE"] = cache_dir
try:
hf_token = os.getenv("HUGGINGFACE_TOKEN")
if hf_token:
login(token=hf_token)
logger.info("Successfully logged in to Hugging Face")
else:
logger.error("Hugging face token not found")
st.error("Huggingface token not found")
except Exception as ex:
logger.error(f"Failed to login to Hugging face: {ex} ")
st.write(f"Failed to login to Hugging face: {ex} ")
try:
os.makedirs(cache_dir, exist_ok=True)
logger.info(f"Created cache directory {cache_dir}")
except Exception as ex:
logger.error(f"Failed to create cache directory {cache_dir}: {ex}")
st.error(f"Failed to create cache directory {cache_dir}: {ex}")
st.title("Visit with Us: Tourism Package Prediction")
st.write("Enter the customer details to predict the likelihood of purchasing the tourism package")
if 'predictor' not in st.session_state:
st.session_state.predictor = None
st.session_state.model_loaded = False
class PredictorTourism:
def __init__(self):
self.Subfolders = 'Master'
self.repoID = 'jpkarthikeyan/Tourism_Prediction_Model'
self.model = None
self.best_threshold = 0.0
def Load_Model(self):
try:
logger.info("Loading best model")
            model_path = hf_hub_download(
                repo_id = self.repoID, filename = 'Model_Dump_JOBLIB/BestModel_GradientBoostingClassifier.joblib',
                repo_type = 'model')
            threshold_path = hf_hub_download(
                repo_id = self.repoID, filename = 'Model_Dump_JOBLIB/best_threshold.txt',
                repo_type = 'model')
logger.info(f"Model path: {model_path}")
logger.info(f"Threshold path: {threshold_path}")
self.model = joblib.load(model_path)
# with open(model_path, 'rb') as f:
# self.model = joblib.load(f)
with open(threshold_path,'r') as f:
self.best_threshold = float(f.read())
st.success("Model and threshold loaded successfully")
return True
except Exception as ex:
st.error(f'Exception: {ex}')
logging.error(f'Exception {ex}')
return False
def Predict(self, data):
try:
logger.info(f"Input Data: {data}")
df= pd.DataFrame([data])
logger.info(f"Data shape: {df.shape}")
logger.info(f"Dataframe columns: {df.columns.tolist()}")
prob = self.model.predict_proba(df)[:,1]
prediction = int(prob >= self.best_threshold)
return prediction
        except Exception as ex:
            logger.error(f"Exception in predict: {ex}", exc_info=True)
            st.error(f"Exception Prediction: {ex}")
            return None
if not st.session_state.model_loaded:
st.session_state.predictor = PredictorTourism()
st.session_state.model_loaded = st.session_state.predictor.Load_Model()
with st.form("customer_form"):
st.header("Customer Details")
col1, col2,col3 = st.columns(3)
with col1:
age = st.number_input("Age", min_value=18, max_value=100, value=41)
gender = st.selectbox('Gender',['Male','Female'])
MaritalStatus = st.selectbox('MaritalStatus',['Married','Unmarried','Single','Divorced'])
Occupation = st.selectbox('Occupation',['Free Lancer','Salaried','Small Business','Large Business'])
Designation = st.selectbox('Designation',['AVP','Manager','Executive','Senior Manager','VP'])
MonthlyIncome = st.number_input('MonthlyIncome',min_value=0, max_value=1000000,value=20999)
with col2:
typeofcontact = st.selectbox("TypeofContact",['Self Enquiry','Company Invited'])
        citytier = st.selectbox('CityTier',[1,2,3], index=2)
DurationOfPitch = st.number_input('DurationOfPitch', min_value=1, max_value=60, value=6)
ProductPitched = st.selectbox('ProductPitched',['Deluxe','Basic','Standard','Super Deluxe','King'])
        PreferredPropertyStar = st.selectbox('PreferredPropertyStar',[3,2,1])
NumberOfTrips = st.number_input('NumberOfTrips',min_value=0, max_value=30, value=1)
with col3:
NumberOfPersonVisiting = st.number_input('NumberOfPersonVisiting',min_value=1,max_value=10,value=3)
NumberOfFollowups = st.number_input('NumberOfFollowups',min_value=0,max_value=10, value=3)
NumberOfChildrenVisiting= st.number_input('NumberOfChildrenVisiting',min_value=0,max_value=5,value=0)
Passport= st.selectbox('Passport',['Yes','No'],format_func=lambda x:"Yes" if x=="Yes" else "No")
Owncar= st.selectbox('OwnCar',['Yes','No'],format_func=lambda x:"Yes" if x=="Yes" else "No")
PitchSatisfactionScore= st.number_input('PitchSatisfactionScore',min_value=1,max_value=5,value=3)
submitted = st.form_submit_button("Predict")
if submitted:
input_data = {
'Age':age,
'TypeofContact':typeofcontact,
'CityTier':citytier,
'DurationOfPitch':DurationOfPitch,
'Occupation':Occupation,
'Gender':gender,
'NumberOfPersonVisiting':NumberOfPersonVisiting,
'NumberOfFollowups':NumberOfFollowups,
'ProductPitched':ProductPitched,
'PreferredPropertyStar':PreferredPropertyStar,
'MaritalStatus':MaritalStatus,
'NumberOfTrips':NumberOfTrips,
'Passport':1 if Passport =="Yes" else 0,
'OwnCar':1 if Owncar =="Yes" else 0,
'PitchSatisfactionScore':PitchSatisfactionScore,
'NumberOfChildrenVisiting':NumberOfChildrenVisiting,
'Designation':Designation,
'MonthlyIncome':MonthlyIncome
}
if st.session_state.predictor:
result = st.session_state.predictor.Predict(input_data)
        if result is not None:
            st.subheader(f"Prediction Result is {result}")
            st.write("Likely to purchase" if result == 1 else "Unlikely to purchase")
        else:
            st.error("Error in prediction")
else:
st.error("Models are not loaded, please ensure the model and threshold are available on Hugging face")
Overwriting Deployment/app.py
|--> class HostingInHuggingFace
|--> def __init__(self,base_path,hf_token=None):
* Constructor to initialize the base path and Hugging Face token
|--> def CreatingSpaceInHF(self):
* Creates the Hugging Face Space that will host the deployment files
|--> def UploadDeploymentFile(self):
* Uploads the deployment files into the Hugging Face Space
|--> def ToRunPipeline(self):
* Pipeline function to invoke the above in sequence
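Every ToRunPipeline in this project follows the same orchestration pattern: run the stages in order and stop at the first failure. A minimal, generic sketch of that pattern (the stage names here are placeholders, not the real methods):

```python
def run_stages(stages):
    """Run (name, callable) pairs in order; stop and report at the first failure."""
    for name, stage in stages:
        if not stage():
            print(f"Pipeline stopped: stage '{name}' failed")
            return False
    print("Pipeline completed")
    return True

# Stand-ins for the real stages, e.g. CreatingSpaceInHF / UploadDeploymentFile
ok = run_stages([
    ("create_space", lambda: True),
    ("upload_files", lambda: True),
])
print(ok)  # True
```

Returning a boolean from each stage is what lets main.py translate a failed stage into `sys.exit(1)`, so a broken step fails the CI/CD job instead of passing silently.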
#@title HostingInHuggingFace.py
%%writefile HostingInHuggingFace.py
import os
import inspect
import traceback
from huggingface_hub import HfApi, create_repo, login,hf_hub_download
from huggingface_hub.utils import RepositoryNotFoundError
class HostingInHuggingFace:
def __init__(self,base_path,hf_token=None):
self.base_path = base_path
self.hf_token = hf_token
self.repo_id = 'jpkarthikeyan/Tourism-Prediction-Model-Space'
def CreatingSpaceInHF(self):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
api = HfApi()
try:
print(f"Checking for {self.repo_id} is correct or not")
api.repo_info(repo_id = self.repo_id,
repo_type='space',
token = self.hf_token)
print(f"Space {self.repo_id} already exists")
except RepositoryNotFoundError:
create_repo(repo_id=self.repo_id,
repo_type='space',
space_sdk='docker',
private=False,
token=self.hf_token)
print(f"Space created in {self.repo_id}")
except Exception as ex:
print(f"Exception in creating space {ex}")
traceback.print_exc()
finally:
print('-'*50)
def UploadDeploymentFile(self):
print(f"Function Name {inspect.currentframe().f_code.co_name}")
try:
api = HfApi(token=self.hf_token)
directory_to_upload = os.path.join(self.base_path,'Deployment')
print(f"Directory to upload {directory_to_upload} into HF Space {self.repo_id}")
api.upload_folder(repo_id=self.repo_id, folder_path=directory_to_upload,
repo_type='space')
            print(f"Successfully uploaded {directory_to_upload} into {self.repo_id}")
return True
except Exception as ex:
            print(f"Exception occurred {ex}")
            traceback.print_exc()
return False
finally:
print('-'*50)
def ToRunPipeline(self):
try:
self.CreatingSpaceInHF()
if self.UploadDeploymentFile():
print('Deployment pipeline completed')
return True
else:
print('Deployment pipeline failed')
return False
except Exception as ex:
            print(f"Exception occurred {ex}")
            traceback.print_exc()
return False
finally:
print('-'*50)
Overwriting HostingInHuggingFace.py
Main Function¶
%cd '/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/'
/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
os.getcwd()
'/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master'
%%writefile main.py
import os
import sys
import argparse
from dotenv import load_dotenv
try:
base_path = os.path.abspath((os.path.dirname(__file__)))
except NameError:
    # __file__ is undefined when run interactively (e.g. in a notebook)
    base_path = os.path.join(os.getcwd(),'Master')
print(base_path)
print(f'Base path {base_path}')
sys.path.append(base_path)
data_dir = os.path.join(base_path, 'Data')
model_dir = os.path.join(base_path,'Model_Dump_JOBLIB')
#job = ['register','prepare']
#job = 'prepare'
parser = argparse.ArgumentParser(description='Run a specific job in the pipeline')
parser.add_argument('--job', type=str, required=True,
choices=['register','prepare','modelbuilding','deploy'],
help='Job To execute register,prepare,modelbuilding,deploy')
args = parser.parse_args()
os.makedirs(data_dir, exist_ok=True)
os.makedirs(model_dir, exist_ok=True)
load_dotenv(dotenv_path=os.path.join(base_path,'.env'))
hf_token = os.getenv('HF_TOKEN')
if not hf_token:
raise ValueError("HF_TOKEN not found in .env file")
if args.job == 'register':
from DataRegistration import DataRegistration
data_reg = DataRegistration(base_path, hf_token)
if not data_reg.ToRunPipeline():
sys.exit(1)
elif args.job == 'prepare':
from DataPrepration import DataPrepration
obj_data_prep = DataPrepration(base_path,hf_token)
if not obj_data_prep.ToRunPipeline():
sys.exit(1)
elif args.job == 'modelbuilding':
from BuildingModels import BuildingModels
ObjBuildModel = BuildingModels(base_path,hf_token)
if not ObjBuildModel.ToRunPipeline():
sys.exit(1)
elif args.job == 'deploy':
from HostingInHuggingFace import HostingInHuggingFace
Obj_deploy = HostingInHuggingFace(base_path,hf_token)
if not Obj_deploy.ToRunPipeline():
sys.exit(1)
Overwriting main.py
#@title Invoking the DataRegistration.py from main.py | !python main.py --job register
!python main.py --job register
Base path /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
Function Name __init__
self.Subfolders: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Data
self.folder_Master: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
folder_data: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Data
--------------------------------------------------
Function Name ToRunPipeline
Function Name HFCreateRepo
Repo jpkarthikeyan/Tourism-visit-with-us-dataset created
----------------------------------------------------------------------------------------------------
--------------------------------------------------
Function Name UploadingSourceData
Source Data File /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Data/tourism.csv
Source data tourism.csv uploaded into jpkarthikeyan/Tourism-visit-with-us-dataset
----------------------------------------------------------------------------------------------------
Data Registration Completed
--------------------------------------------------
#@title Invoking the DataPrepration.py from main.py | !python main.py --job prepare
!python main.py --job prepare
Base path /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
Function Name __init__
self.repoID: jpkarthikeyan/Tourism-visit-with-us-dataset
self.Subfolders: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Data
--------------------------------------------------
Function Name ToRunPipeline
Function Name LoadDatasetFromHF
Shape of the original dataset (4128, 21)
Dataset loaded from jpkarthikeyan/Tourism-visit-with-us-dataset//content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Data
Shape of the Original Dataset: (4128, 20)
--------------------------------------------------
Function Name TrainTestSplit
Value Count ProdTaken
0    3331
1     797
Name: count, dtype: int64
Shape of the train dataset: (3302, 20)
Shape of the test dataset: (826, 20)
--------------------------------------------------
Function Name DatasetCleaning
--------------------------------------------------
Function Name DatasetCleaning
--------------------------------------------------
Function Name UploadIntoHF
Source data train.csv uploaded into jpkarthikeyan/Tourism-visit-with-us-dataset
--------------------------------------------------
Function Name UploadIntoHF
Source data test.csv uploaded into jpkarthikeyan/Tourism-visit-with-us-dataset
--------------------------------------------------
Dataset downloaded from HF, cleaned, split into train and test datasets, and uploaded back into the HF dataset repo
--------------------------------------------------
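The 3302/826 row counts in the log above correspond to an 80/20 stratified split on the `ProdTaken` target, which keeps the 0/1 class ratio identical in both sets. A minimal sketch of that step, using a small synthetic frame in place of the real tourism.csv (which the pipeline pulls from the jpkarthikeyan/Tourism-visit-with-us-dataset repo on Hugging Face):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for tourism.csv: 100 rows with ~20% positives,
# roughly the class balance seen in the real data (797 of 4128).
df = pd.DataFrame({
    "Age": range(100),
    "ProdTaken": [1 if i % 5 == 0 else 0 for i in range(100)],
})

# 80/20 stratified split: the 826-row test set above is 826/4128 ≈ 0.2
# of the full dataset, so test_size=0.2 reproduces it.
train_df, test_df = train_test_split(
    df, test_size=0.2, stratify=df["ProdTaken"], random_state=42
)
print(train_df.shape, test_df.shape)  # (80, 2) (20, 2)
```

Stratifying on the target matters here because the classes are imbalanced (roughly 4:1); a plain random split could leave the small positive class under-represented in the test set.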
#@title Invoking the BuildingModels.py from main.py | !python main.py --job modelbuilding
!python main.py --job modelbuilding
Base path /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
Function Name __init__
ML Run path: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/mlruns
Tracking URI file:///content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/mlruns
Experiment ID <Experiment: artifact_location='file:///content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/mlruns/458707103760693873', creation_time=1755962521638, experiment_id='458707103760693873', last_update_time=1755962521638, lifecycle_stage='active', name='Tourism-Prediction-Experiment', tags={}>
Function Name ToRunPipeline
Function Name Load_data_from_HF
Loading the train dataset from jpkarthikeyan/Tourism-visit-with-us-dataset
Shape of the train dataset: (3302, 19)
Shape of the test dataset: (826, 19)
--------------------------------------------------
Function Name Preprocessing_dataset
--------------------------------------------------
Function Name Building_Models
Model DecisionTreeClassifier started
Fitting 3 folds for each of 50 candidates, totalling 150 fits
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=1, classifier__min_samples_split=10, classifier__splitter=random; total time= 0.0s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=1, classifier__min_samples_split=10, classifier__splitter=random; total time= 0.1s
[CV] END classifier__criterion=entropy, classifier__max_depth=1, classifier__max_features=log2, classifier__min_samples_leaf=1, classifier__min_samples_split=10, classifier__splitter=random; total time= 0.1s
[... remaining cross-validation fold logs (150 fits in total) truncated for readability ...]
Model path: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Model_Dump_JOBLIB/DecisionTreeClassifier.joblib
model: Pipeline(steps=[('preprocessor',
                        ColumnTransformer(transformers=[('num',
                                                         Pipeline(steps=[('imputer', SimpleImputer(strategy='median')),
                                                                         ('scaler', StandardScaler())]),
                                                         ['Age', 'CityTier', 'DurationOfPitch',
                                                          'NumberOfPersonVisiting', 'NumberOfFollowups',
                                                          'PreferredPropertyStar', 'NumberOfTrips',
                                                          'Passport', 'PitchSatisfactionScore', 'OwnCar',
                                                          'NumberOfChildrenVisiting', 'MonthlyIncome']),
                                                        ('onehot',
                                                         OneHotEncoder(drop='first', handle_unknown='ignore',
                                                                       sparse_output=False),
                                                         ['TypeofContact', 'Occupation', 'Gender',
                                                          'ProductPitched', 'MaritalStatus',
                                                          'Designation'])])),
                       ('classifier',
                        DecisionTreeClassifier(class_weight='balanced', max_depth=1,
                                               min_samples_leaf=2, min_samples_split=5,
                                               random_state=42, splitter='random'))])
best_score: 0.4412564666937607
best_params: {'classifier__splitter': 'random', 'classifier__min_samples_split': 5, 'classifier__min_samples_leaf': 2, 'classifier__max_features': None, 'classifier__max_depth': 1, 'classifier__criterion': 'gini'}
Model DecisionTreeClassifier completed
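The "Model path" line above records the fitted pipeline being persisted with joblib. A minimal sketch of that save/reload round trip, using a trivial stand-in estimator and a local filename rather than the Drive path from the log:

```python
import joblib
from sklearn.dummy import DummyClassifier

# Any fitted scikit-learn estimator or pipeline can be dumped the same way;
# a DummyClassifier stands in for the tuned pipeline here.
model = DummyClassifier(strategy='most_frequent').fit([[0], [1], [2]], [0, 0, 1])

path = 'DecisionTreeClassifier.joblib'  # illustrative local path
joblib.dump(model, path)

# Reload and use exactly like the original fitted object.
restored = joblib.load(path)
print(restored.predict([[5]]))  # → [0]
```

Loading the `.joblib` artifact in the deployment step restores the whole pipeline, preprocessing included, so raw customer records can be scored directly.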
--------------------------------------------------
Model RandomForestClassifier started
Fitting 3 folds for each of 50 candidates, totalling 150 fits
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=15, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=15, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=15, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 1.1s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 1.0s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 1.0s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 1.1s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.8s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.3s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=10, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=25, classifier__n_estimators=50, classifier__oob_score=True; total time= 0.4s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.6s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=15, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.3, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.5s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.8s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 1.1s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.6, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 1.1s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 0.9s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 1.0s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=5, classifier__max_features=0.5, classifier__min_samples_leaf=10, classifier__min_samples_split=20, classifier__n_estimators=100, classifier__oob_score=True; total time= 1.1s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time= 1.0s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.8s
[CV] END classifier__bootstrap=True, classifier__criterion=gini, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=15, classifier__min_samples_split=20, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.8s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.8s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.7s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=15, classifier__max_features=0.3, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=25, classifier__oob_score=True; total time= 0.2s
[CV] END classifier__bootstrap=True, classifier__criterion=entropy, classifier__max_depth=10, classifier__max_features=0.5, classifier__min_samples_leaf=7, classifier__min_samples_split=25, classifier__n_estimators=75, classifier__oob_score=True; total time= 0.7s
[... further [CV] fold logs for RandomForestClassifier omitted ...]
Model path: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Model_Dump_JOBLIB/RandomForestClassifier.joblib
model: Pipeline(steps=[('preprocessor',
ColumnTransformer(transformers=[('num',
Pipeline(steps=[('imputer',
SimpleImputer(strategy='median')),
('scaler',
StandardScaler())]),
['Age', 'CityTier',
'DurationOfPitch',
'NumberOfPersonVisiting',
'NumberOfFollowups',
'PreferredPropertyStar',
'NumberOfTrips', 'Passport',
'PitchSatisfactionScore',
'OwnCar',
'NumberOfChildrenVisit...
OneHotEncoder(drop='first',
handle_unknown='ignore',
sparse_output=False),
['TypeofContact',
'Occupation', 'Gender',
'ProductPitched',
'MaritalStatus',
'Designation'])])),
('classifier',
RandomForestClassifier(class_weight='balanced',
criterion='entropy', max_depth=15,
max_features=0.6, min_samples_leaf=7,
min_samples_split=20, n_estimators=25,
oob_score=True, random_state=42))])
best_score: 0.6512043836331847
best_params: {'classifier__oob_score': True, 'classifier__n_estimators': 25, 'classifier__min_samples_split': 20, 'classifier__min_samples_leaf': 7, 'classifier__max_features': 0.6, 'classifier__max_depth': 15, 'classifier__criterion': 'entropy', 'classifier__bootstrap': True}
Model RandomForestClassifier completed
--------------------------------------------------
Model GradientBoostingClassifier started
Fitting 3 folds for each of 50 candidates, totalling 150 fits
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=125, classifier__subsample=0.8; total time= 0.5s
[... further [CV] fold logs for GradientBoostingClassifier omitted ...]
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.5, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.7; total time= 0.2s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.5, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.7; total time= 0.2s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.01, classifier__max_depth=4, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.8; total time= 0.3s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.01, classifier__max_depth=4, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.8; total time= 0.3s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.01, classifier__max_depth=4, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.8; total time= 0.3s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=100, classifier__subsample=0.7; total time= 0.5s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=100, classifier__subsample=0.7; total time= 0.5s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=100, classifier__subsample=0.7; total time= 0.5s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=4, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.7; total time= 0.5s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=4, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.7; total time= 0.5s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=4, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.7; total time= 0.5s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=4, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.7; total time= 0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=4, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.7; total time= 0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=4, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.7; total time= 0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=4, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__n_estimators=100, classifier__subsample=0.7; total time= 0.4s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=4, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__n_estimators=100, classifier__subsample=0.7; total time= 0.4s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=4, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__n_estimators=100, classifier__subsample=0.7; total time= 0.4s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.6; total time= 0.4s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.6; total time= 0.4s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.6; total time= 0.4s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=1, classifier__n_estimators=100, classifier__subsample=0.6; total time= 0.5s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=1, classifier__n_estimators=100, classifier__subsample=0.6; total time= 0.5s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.6; total time= 0.4s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=1, classifier__n_estimators=100, classifier__subsample=0.6; total time= 0.5s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.6; total time= 0.4s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.6; total time= 0.4s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.6; total time= 0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.6; total time= 0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.6; total time= 0.2s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.6; total time= 0.6s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.6; total time= 0.6s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.7; total time= 0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.7; total time= 0.2s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.6; total time= 0.6s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=75, classifier__subsample=0.7; total time= 0.2s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=3, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.7; total time= 0.2s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=3, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.7; total time= 0.2s
[CV] END classifier__criterion=friedman_mse, classifier__learning_rate=0.1, classifier__max_depth=3, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=75, classifier__subsample=0.7; total time= 0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=5, classifier__max_features=sqrt, classifier__min_samples_leaf=1, classifier__n_estimators=125, classifier__subsample=0.8; total time= 0.7s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=5, classifier__max_features=sqrt, classifier__min_samples_leaf=1, classifier__n_estimators=125, classifier__subsample=0.8; total time= 0.7s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.8; total time= 0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.8; total time= 0.1s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.8; total time= 0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=5, classifier__max_features=sqrt, classifier__min_samples_leaf=1, classifier__n_estimators=125, classifier__subsample=0.8; total time= 0.7s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=3, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=100, classifier__subsample=0.8; total time= 0.4s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=3, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=100, classifier__subsample=0.8; total time= 0.3s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=3, classifier__max_features=sqrt, classifier__min_samples_leaf=2, classifier__n_estimators=100, classifier__subsample=0.8; total time= 0.3s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.6; total time= 0.2s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.6; total time= 0.3s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.5, classifier__max_depth=5, classifier__max_features=log2, classifier__min_samples_leaf=2, classifier__n_estimators=50, classifier__subsample=0.6; total time= 0.4s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.7; total time= 0.5s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.7; total time= 0.5s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.01, classifier__max_depth=2, classifier__max_features=sqrt, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.7; total time= 0.6s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=4, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.8; total time= 0.8s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=4, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.8; total time= 0.8s
[CV] END classifier__criterion=squared_error, classifier__learning_rate=0.1, classifier__max_depth=4, classifier__max_features=log2, classifier__min_samples_leaf=4, classifier__n_estimators=125, classifier__subsample=0.8; total time= 0.6s
Model path: /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Model_Dump_JOBLIB/GradientBoostingClassifier.joblib
model:Pipeline(steps=[('preprocessor',
ColumnTransformer(transformers=[('num',
Pipeline(steps=[('imputer',
SimpleImputer(strategy='median')),
('scaler',
StandardScaler())]),
['Age', 'CityTier',
'DurationOfPitch',
'NumberOfPersonVisiting',
'NumberOfFollowups',
'PreferredPropertyStar',
'NumberOfTrips', 'Passport',
'PitchSatisfactionScore',
'OwnCar',
'NumberOfChildrenVisiting',
'MonthlyIncome']),
('onehot',
OneHotEncoder(drop='first',
handle_unknown='ignore',
sparse_output=False),
['TypeofContact',
'Occupation', 'Gender',
'ProductPitched',
'MaritalStatus',
'Designation'])])),
('classifier',
GradientBoostingClassifier(learning_rate=0.5, max_depth=5,
max_features='log2',
random_state=42, subsample=0.8))])
best_score: 0.6906142009293693
best_params: {'classifier__subsample': 0.8, 'classifier__n_estimators': 100, 'classifier__min_samples_leaf': 1, 'classifier__max_features': 'log2', 'classifier__max_depth': 5, 'classifier__learning_rate': 0.5, 'classifier__criterion': 'friedman_mse'}
Model GradientBoostingClassifier completed
--------------------------------------------------
--------------------------------------------------
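The dumped pipeline above can be reloaded with joblib for inference. A minimal sketch of the dump/load round trip, with a toy pipeline standing in for the full ColumnTransformer preprocessing (the file name mirrors the notebook's dump convention; data and feature set are illustrative):

```python
# Sketch: persist and reload a fitted pipeline with joblib.
# Toy data stands in for the tourism dataset.
import joblib
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import GradientBoostingClassifier

X = np.random.RandomState(42).rand(100, 4)
y = (X[:, 0] > 0.5).astype(int)

model = Pipeline(steps=[
    ("scaler", StandardScaler()),
    ("classifier", GradientBoostingClassifier(
        learning_rate=0.5, max_depth=5, subsample=0.8, random_state=42)),
])
model.fit(X, y)

# Dump to disk, reload, and confirm the reloaded model predicts identically
joblib.dump(model, "GradientBoostingClassifier.joblib")
reloaded = joblib.load("GradientBoostingClassifier.joblib")
assert (reloaded.predict(X) == model.predict(X)).all()
```

In the pipeline, the same `joblib.load` call is what the deployed Space would use to serve predictions from the uploaded `.joblib` artifact.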
{'DecisionTreeClassifier': {'model': Pipeline(steps=[('preprocessor',
ColumnTransformer(transformers=[('num',
Pipeline(steps=[('imputer',
SimpleImputer(strategy='median')),
('scaler',
StandardScaler())]),
['Age', 'CityTier',
'DurationOfPitch',
'NumberOfPersonVisiting',
'NumberOfFollowups',
'PreferredPropertyStar',
'NumberOfTrips', 'Passport',
'PitchSatisfactionScore',
'OwnCar',
'NumberOfChildrenVisiting',
'MonthlyIncome']),
('onehot',
OneHotEncoder(drop='first',
handle_unknown='ignore',
sparse_output=False),
['TypeofContact',
'Occupation', 'Gender',
'ProductPitched',
'MaritalStatus',
'Designation'])])),
('classifier',
DecisionTreeClassifier(class_weight='balanced', max_depth=1,
min_samples_leaf=2, min_samples_split=5,
random_state=42, splitter='random'))]), 'best_score': np.float64(0.4412564666937607), 'best_params': {'classifier__splitter': 'random', 'classifier__min_samples_split': 5, 'classifier__min_samples_leaf': 2, 'classifier__max_features': None, 'classifier__max_depth': 1, 'classifier__criterion': 'gini'}}, 'RandomForestClassifier': {'model': Pipeline(steps=[('preprocessor',
ColumnTransformer(transformers=[('num',
Pipeline(steps=[('imputer',
SimpleImputer(strategy='median')),
('scaler',
StandardScaler())]),
['Age', 'CityTier',
'DurationOfPitch',
'NumberOfPersonVisiting',
'NumberOfFollowups',
'PreferredPropertyStar',
'NumberOfTrips', 'Passport',
'PitchSatisfactionScore',
'OwnCar',
'NumberOfChildrenVisit...
OneHotEncoder(drop='first',
handle_unknown='ignore',
sparse_output=False),
['TypeofContact',
'Occupation', 'Gender',
'ProductPitched',
'MaritalStatus',
'Designation'])])),
('classifier',
RandomForestClassifier(class_weight='balanced',
criterion='entropy', max_depth=15,
max_features=0.6, min_samples_leaf=7,
min_samples_split=20, n_estimators=25,
oob_score=True, random_state=42))]), 'best_score': np.float64(0.6512043836331847), 'best_params': {'classifier__oob_score': True, 'classifier__n_estimators': 25, 'classifier__min_samples_split': 20, 'classifier__min_samples_leaf': 7, 'classifier__max_features': 0.6, 'classifier__max_depth': 15, 'classifier__criterion': 'entropy', 'classifier__bootstrap': True}}, 'GradientBoostingClassifier': {'model': Pipeline(steps=[('preprocessor',
ColumnTransformer(transformers=[('num',
Pipeline(steps=[('imputer',
SimpleImputer(strategy='median')),
('scaler',
StandardScaler())]),
['Age', 'CityTier',
'DurationOfPitch',
'NumberOfPersonVisiting',
'NumberOfFollowups',
'PreferredPropertyStar',
'NumberOfTrips', 'Passport',
'PitchSatisfactionScore',
'OwnCar',
'NumberOfChildrenVisiting',
'MonthlyIncome']),
('onehot',
OneHotEncoder(drop='first',
handle_unknown='ignore',
sparse_output=False),
['TypeofContact',
'Occupation', 'Gender',
'ProductPitched',
'MaritalStatus',
'Designation'])])),
('classifier',
GradientBoostingClassifier(learning_rate=0.5, max_depth=5,
max_features='log2',
random_state=42, subsample=0.8))]), 'best_score': np.float64(0.6906142009293693), 'best_params': {'classifier__subsample': 0.8, 'classifier__n_estimators': 100, 'classifier__min_samples_leaf': 1, 'classifier__max_features': 'log2', 'classifier__max_depth': 5, 'classifier__learning_rate': 0.5, 'classifier__criterion': 'friedman_mse'}}}
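Given a results dict shaped like the printout above (`{model name: {'model': ..., 'best_score': ..., 'best_params': ...}}`), the best candidate can be picked by its cross-validated score. A minimal sketch using the scores reported above (the dict values here are stand-ins for the full entries):

```python
# Sketch: select the winning model by best_score. Scores copied from
# the tuning output above; entries are trimmed to what the selection needs.
results = {
    "DecisionTreeClassifier": {"best_score": 0.4412564666937607},
    "RandomForestClassifier": {"best_score": 0.6512043836331847},
    "GradientBoostingClassifier": {"best_score": 0.6906142009293693},
}

best_name = max(results, key=lambda name: results[name]["best_score"])
print(best_name)  # -> GradientBoostingClassifier
```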
Function Name Model_Evaluation
Predict probability shape DecisionTreeClassifier (826, 2)
best threshold: 0.7076288212958588
Figure(300x300)
Predict probability shape RandomForestClassifier (826, 2)
best threshold: 0.43784253782494276
Figure(300x300)
Predict probability shape GradientBoostingClassifier (826, 2)
best threshold: 0.24867338889476726
Figure(300x300)
--------------------------------------------------
model accuracy precision recall f1_score
0 DecisionTreeClassifier 0.699758 0.326848 0.528302 0.403846
1 RandomForestClassifier 0.868039 0.625000 0.786164 0.696379
2 GradientBoostingClassifier 0.917676 0.766082 0.823899 0.793939
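One common way to arrive at a "best threshold" like those reported above is to sweep the precision-recall curve of the `predict_proba` scores and keep the threshold that maximizes F1. The notebook's exact selection rule is not shown, so treat this as an illustration on synthetic scores:

```python
# Sketch: pick a classification threshold by maximizing F1 along the
# precision-recall curve. Synthetic scores stand in for predict_proba
# output on the held-out test set.
import numpy as np
from sklearn.metrics import precision_recall_curve

rng = np.random.RandomState(0)
y_true = rng.randint(0, 2, 500)
y_score = np.clip(y_true * 0.6 + rng.rand(500) * 0.5, 0, 1)

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
# precision/recall have one more entry than thresholds, so drop the last
f1 = 2 * precision * recall / (precision + recall + 1e-12)
best_threshold = thresholds[np.argmax(f1[:-1])]
print(f"best threshold: {best_threshold:.4f}")
```

Tuning the threshold this way trades precision against recall per model, which is why each classifier above gets its own operating point rather than the default 0.5.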
Function Name Register_BestModel_HF
Uploading the best model to Hugging Face
BestModel_GradientBoostingClassifier.joblib: 100% 480k/480k [00:00<00:00, 645kB/s]
Uploading the best threshold text file to HF
2025/08/24 04:00:47 WARNING mlflow.models.model: `artifact_path` is deprecated. Please use `name` instead.
/usr/local/lib/python3.12/dist-packages/mlflow/types/utils.py:452: UserWarning: Hint: Inferred schema contains integer column(s). Integer columns in Python cannot represent missing values. If your input data contains missing values at inference time, it will be encoded as floats and will cause a schema enforcement error. The best way to avoid this problem is to infer the model schema based on a realistic data sample (training dataset) that includes missing values. Alternatively, you can declare integer columns as doubles (float64) whenever these columns may have missing values. See `Handling Integers With Missing Values <https://www.mlflow.org/docs/latest/models.html#handling-integers-with-missing-values>`_ for more details.
warnings.warn(
--------------------------------------------------
--------------------------------------------------
#@title Invoking the HostingInHuggingFace.py from main.py | !python main.py --job deploy
!python main.py --job deploy
Base path /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
Function Name CreatingSpaceInHF
Checking whether jpkarthikeyan/Tourism-Prediction-Model-Space exists
Space jpkarthikeyan/Tourism-Prediction-Model-Space already exists
--------------------------------------------------
Function Name UploadDeploymentFile
Directory to upload /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Deployment into HF Space jpkarthikeyan/Tourism-Prediction-Model-Space
No files have been modified since last commit. Skipping to prevent empty commit.
Successfully uploaded /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/Deployment into jpkarthikeyan/Tourism-Prediction-Model-Space
--------------------------------------------------
Deployment pipeline completed
--------------------------------------------------
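`main.py` evidently routes `--job` values (register, prepare, modelbuilding, deploy) to the corresponding pipeline stage. A hedged sketch of that dispatch; the handler bodies below are placeholders (the real ones call DataRegistration, DataPrepration, BuildingModels, and HostingInHuggingFace), but the CLI shape matches the calls used throughout this notebook:

```python
# Sketch of the assumed --job dispatch inside main.py.
# Handler bodies are placeholders for the real pipeline stages.
import argparse

HANDLERS = {
    "register": lambda: "Data registration completed",
    "prepare": lambda: "Data preparation completed",
    "modelbuilding": lambda: "Model building completed",
    "deploy": lambda: "Deployment pipeline completed",
}

def run_job(job: str) -> str:
    # Look up and invoke the stage requested on the command line
    return HANDLERS[job]()

parser = argparse.ArgumentParser(description="Visit With Us MLOps pipeline")
parser.add_argument("--job", required=True, choices=sorted(HANDLERS))
args = parser.parse_args(["--job", "deploy"])  # e.g. `python main.py --job deploy`
print(run_job(args.job))  # -> Deployment pipeline completed
```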
pipeline.yml¶
%cd '/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/'
/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1
!ls
Master
|--> pipeline.yml
     |--> Initializing
     |--> jobs
          |--> register-dataset
               |--> Set up job
               |--> Checkout Repository
               |--> Set up Python
               |--> Install Dependencies
               |--> List Directory Contents (debug)
               |--> Copy tourism.csv from local
               |--> Run DataRegistration
               |--> Check pipeline status
               |--> Verify Upload
               |--> Post Set up Python
               |--> Post Checkout Repository
               |--> Complete Job
          |--> data-prepration
               |--> Set up job
               |--> Checkout Repository
               |--> Set up Python
               |--> Install Dependencies
               |--> Copy tourism.csv
               |--> Run DataPrepration.py
               |--> Check pipeline status
               |--> Verify Upload
               |--> Post Set up Python
               |--> Post Checkout Repository
               |--> Complete Job
          |--> model-building
               |--> Set up job
               |--> Checkout Repository
               |--> Set up Python
               |--> Install Dependencies
               |--> Create Model Dump Directory
               |--> Run BuildingModels.py
               |--> Check pipeline status
               |--> Verify Execution
               |--> List Generated Files
               |--> Commit and Push Generated Files
               |--> Pull Remote Changes
               |--> Push Generated Files
               |--> Post Set up Python
               |--> Post Checkout Repository
               |--> Complete Job
          |--> deploy-to-spaces
               |--> Set up job
               |--> Checkout Repository
               |--> Set up Python
               |--> Install Dependencies
               |--> Set up Docker Buildx
               |--> Debug Authentication
               |--> Login to GitHub Container Registry
               |--> Build and push Docker image to GitHub Container Registry
               |--> Deploy to Hugging Face Spaces
               |--> Check Deployment status
               |--> Post Set up Docker Buildx
               |--> Post Set up Python
               |--> Post Checkout Repository
               |--> Complete Job
%%writefile .github/workflows/pipeline.yml
name: Visit With Us Toursim Prediction Pipeline
on:
push:
branches:
- main # Automatically triggers on push to the main branch
paths:
- 'Master/Data/tourism.csv'
- 'Master/DataRegistration.py'
- 'Master/DataPrepration.py'
- 'Master/BuildingModels.py'
- 'Master/main.py'
- 'Master/HostingInHuggingFace.py'
- '.github/workflows/pipeline.yml'
- 'Master/Deployment/**'
workflow_dispatch:
jobs:
register-dataset:
runs-on: ubuntu-latest
steps:
- name: Checkout Repository
uses: actions/checkout@v3
- name: Setup python
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Install Dependencies
run: |
python -m pip install --upgrade pip
pip install huggingface_hub python-dotenv
- name: List Directory Contents(Debug)
run: |
ls -la Master/Data/ || echo "Master/Data/ directory not found"
ls -la . || echo "Root Directory contents"
- name: Copy tourism.csv(if using a local file)
run: |
mkdir -p Master/Data
if [ -f tourism.csv ]; then
cp tourism.csv Master/Data/
echo "Copied tourism.csv from root to Master/Data/"
else
echo "tourism.csv not found in root attemtpting to download from hugging face"
python -c "from huggingface_hub import hf_hub_download;hf_hub_download(repo_id='jpkarthikeyan/Tourism-visit-with-us-dataset',filename='tourism.csv',local_dir='Master/Data/')"
fi
- name: Run Data Registration
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
cd Master
python main.py --job register
continue-on-error: false
- name: Check Pipeline status
if: failure()
run: |
echo "Data Registration pipeline failed. please check logs"
exit 1
- name: Verify Upload
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
echo "Verifying Upload on Hugging Face"
python -c "import os;from huggingface_hub import HfApi;api= HfApi(token=os.getenv('HF_TOKEN'));print(api.repo_info(repo_id='jpkarthikeyan/Tourism-visit-with-us-dataset',repo_type='dataset'))"
data-prepration:
runs-on: ubuntu-latest
needs: register-dataset
steps:
- name: Checkout Repository
uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Install Dependencies
run: |
python -m pip install --upgrade pip
pip install pandas numpy huggingface_hub python-dotenv datasets scikit-learn
- name: Copying tourism.csv
run: |
mkdir -p Master/Data
cp tourism.csv Master/Data || echo "tourism.csv not found in root"
- name: Run DataPrepration.py
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
cd Master
python main.py --job prepare
continue-on-error: false
- name: Check Pipeline Status
if: failure()
run: |
echo "Data Prepration pipeline failed. please check the log"
exit 1
- name: Verify Upload
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
echo "Verifying Upload on Hugging Face"
python -c "import os; from huggingface_hub import HfApi; token = os.getenv('HF_TOKEN');print(HfApi(token=token).repo_info(repo_id='jpkarthikeyan/Tourism-visit-with-us-dataset', repo_type='dataset'))"
model-building:
runs-on: ubuntu-latest
needs: data-prepration
steps:
- name: Checkout Repository
uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: Install Dependencies
run: |
python -m pip install --upgrade pip
pip install huggingface_hub python-dotenv pandas numpy scikit-learn joblib xgboost seaborn matplotlib datasets mlflow
- name: Create Model Dump Directory
run: |
mkdir -p Master/Model_Dump_JOBLIB
mkdir -p Master/mlruns
- name: Set Permission for MLFlow and Model Directories
run: |
mkdir -p Master/mlruns && chmod -R 777 Master/mlruns
mkdir -p Master/Model_Dump_JOBLIB && chmod -R 777 Master/Model_Dump_JOBLIB
- name: Debug Directory Contents
run: |
ls -la Master/
ls -la Master/Model_Dump_JOBLIB/ || echo "Model_Dump_JOBLIB is empty"
ls -la Master/mlruns/ || echo "mlruns is empty"
- name: Run Model Building
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
MLFLOW_TRACKING_URI: file://${{ github.workspace }}/Master/mlruns
run: |
cd Master
python main.py --job modelbuilding
continue-on-error: false
- name: Check pipeline status
if: failure()
run: |
echo "Exception in Build Models. please check logs"
exit 1
- name: Verify Execution
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
echo "Verifying the execution"
python -c "import os; from huggingface_hub import HfApi;token=os.getenv('HF_TOKEN');print(HfApi(token=token).repo_info(repo_id='jpkarthikeyan/Tourism_Prediction_Model',repo_type='model')) "
- name: List Generated Files
run: |
ls -l Master/Model_Dump_JOBLIB/
- name: Commit and Push Generated Files
run: |
git config --global user.name 'github-actions[bot]'
git config --global user.email 'github-actions[bot]@users.noreply.github.com'
git add Master/Model_Dump_JOBLIB/*
git commit -m "Adding genearated model files and confusion matrix plots" || echo "No changes to commit"
git pull origin main --rebase || {
echo "Merge Conflict detectd. Aborting rebase and skipping"
git rebase --abort
exit 0
}
git pull origin main
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Handle Rebase Failure
if: failure()
run:
echo "Rebase failed. cleaning up"
git rebase --abort || true
exit 0
deploy-to-spaces:
runs-on: ubuntu-latest
permissions:
packages: write
contents: read
actions: read
needs: model-building
steps:
- name: Checkout Repository
uses: actions/checkout@v3
- name: SET UP PYTHON
uses: actions/setup-python@v5
with:
python-version: '3.12'
- name: INSTALL DEPENDENCIES
run: |
python -m pip install --upgrade pip
pip install huggingface_hub python-dotenv
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Debug Authentication
run: |
echo "Actor: $GITHUB_ACTOR"
echo "PAT_TOKEN is set: ${PAT_TOKEN:+[SET_${#PAT_TOKEN}_chars]}"
if [ -z "$PAT_TOKEN" ]; then
echo "PAT_TOKEN is empty";
else
echo "PAT_TOKEN length: ${#PAT_TOKEN}";
fi
- name: Login to GITHUB CONTAINER REGISTRY
env:
PAT_TOKEN: ${{ secrets.PAT_TOKEN }}
run: |
echo "Login to GITHUB Container Reistry"
echo $PAT_TOKEN | docker login -u ${GITHUB_ACTOR} --password-stdin ghcr.io
echo "Docker Login succss"
- name: Build and Push Docker image to GITHUB Container REGISTRY
env:
PAT_TOKEN: ${{ secrets.PAT_TOKEN }}
run: |
cd Master/Deployment
docker build -t jpkarthik/tourism-prediction-app:latest .
docker tag jpkarthik/tourism-prediction-app:latest ghcr.io/jpkarthik/tourism-prediction-app:latest
docker push ghcr.io/jpkarthik/tourism-prediction-app:latest
- name: Deploy to Hugging Face Spaces
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
cd Master
python main.py --job deploy
echo "Deployment to HF Space complete"
- name: Check Deployment Status
if: failure()
run: |
echo "Deployment to Hugging Face Space failed; please check the logs"
exit 1
Overwriting .github/workflows/pipeline.yml
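The commit step in the workflow above can be exercised outside Actions. Below is a minimal sketch against a throwaway repository (the `generated/` directory is a stand-in for `Master/Model_Dump_JOBLIB`), showing how the `|| echo` fallback keeps a re-run from failing when there is nothing new to commit:

```shell
# Throwaway repo standing in for the checkout inside the Actions runner.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name 'github-actions[bot]'
git config user.email 'github-actions[bot]@users.noreply.github.com'

mkdir -p generated                # stand-in for Master/Model_Dump_JOBLIB
echo model > generated/model.joblib

git add generated/
git commit -q -m "Adding generated model files" || echo "No changes to commit"

# Second run with no new artifacts: git commit exits non-zero,
# and the fallback keeps the step (and the job) green.
git add generated/
git commit -q -m "Adding generated model files" || echo "No changes to commit"
```

Without the fallback, the second commit's non-zero exit status would fail the whole job even though nothing was actually wrong.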
ngrok¶
os.getcwd()
'/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master'
import os
import subprocess
import time

import mlflow
import requests
from google.colab import userdata
from pyngrok import ngrok

# Authenticate ngrok with the key stored in Colab secrets
# (avoid printing the key itself in a shared notebook).
ngrok_key = userdata.get('ngrok_key')
ngrok.set_auth_token(ngrok_key)

# Point MLflow at the file-based tracking store under the project folder.
# base_path is defined earlier in the notebook.
mlrun_path = os.path.join(base_path, 'mlruns')
os.environ['MLFLOW_TRACKING_URI'] = f'file://{mlrun_path}'
mlflow.set_tracking_uri(f'file://{mlrun_path}')
print(mlflow.get_tracking_uri())

# Launch the MLflow UI in its own process group so it can be cleaned up later.
mlflow_process = subprocess.Popen(
    ["mlflow", "ui", "--host", "0.0.0.0", "--port", "5000"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    preexec_fn=os.setsid
)

time.sleep(5)
try:
    response = requests.get("http://localhost:5000")
    if response.status_code == 200:
        print("MLflow UI is running")
    else:
        print(f"MLflow is not running: {response.status_code}")
        # Only drain the server's output if it has already exited;
        # communicate() would block forever on a healthy server.
        if mlflow_process.poll() is not None:
            stdout, stderr = mlflow_process.communicate()
            print(f"stdout: {stdout.decode()}")
            print(f"stderr: {stderr.decode()}")
except Exception as ex:
    print(f"MLflow is not running: {ex}")

# Expose the local UI through an ngrok tunnel.
public_url = ngrok.connect(5000)
print(f"MLflow UI running at: {public_url}")
file:///content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/mlruns
MLflow UI is running
MLflow UI running at: NgrokTunnel: "https://ed778018fffb.ngrok-free.app" -> "http://localhost:5000"
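The `preexec_fn=os.setsid` argument above puts the MLflow server in its own process group, which is what makes it possible to shut the UI down cleanly afterwards. A minimal sketch of that teardown, using a `sleep` process as a stand-in for `mlflow ui`:

```python
import os
import signal
import subprocess

# Start a long-running process in its own session/process group,
# exactly as the MLflow UI is launched above.
proc = subprocess.Popen(["sleep", "60"], preexec_fn=os.setsid)

# Killing the group (not just the pid) also stops any children
# the server may have forked.
os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
proc.wait()
print(proc.returncode)  # negative value: terminated by that signal number
```

In the notebook, the same call against `mlflow_process.pid` (plus `ngrok.disconnect(public_url.public_url)`) frees port 5000 and the tunnel before a re-run.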
GIT STEPS TO PUSH THE CODE FROM LOCAL TO REMOTE¶
Step 01
from google.colab import drive
drive.mount('/content/drive')
Step 02
%cd /content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/
Step 03
!apt-get install git
Step 04
git init
Step 05
git config --global user.email "jpkaimlgl@gmail.com"
git config --global user.name "jpkarthik"
Step 06
git remote add origin https://github.com/jpkarthik/VisitWithUs-ColabNotebook/
Step 07
git add Master/**
Step 08
git add .github/workflows/pipeline.yml
Step 09
git commit -m "Comments"
Step 10
git push origin main
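Condensed, the steps above amount to the following script; `REPO_DIR` and the placeholder files are stand-ins for the project folder on Drive, and the final push is left commented out because it needs credentials:

```shell
REPO_DIR=$(mktemp -d)    # stand-in for the VisitWithUs project folder on Drive
cd "$REPO_DIR"
git init -q
git config user.email "jpkaimlgl@gmail.com"   # local config; the notebook uses --global
git config user.name  "jpkarthik"
git remote add origin https://github.com/jpkarthik/VisitWithUs-ColabNotebook/

mkdir -p Master .github/workflows             # placeholder content for the sketch
touch Master/main.py .github/workflows/pipeline.yml

git add Master/** .github/workflows/pipeline.yml
git commit -q -m "Comments"
# git push origin main                        # requires a configured credential/PAT
```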
MLFLOW SCREENSHOT¶
MLFLOW Home page¶
Best Models¶
Experiments--> Tourism-Prediction-Experiment --> Run¶
Best Model Metrics page¶
HuggingFace Dataset¶
HuggingFace dataset FileVersion¶
HuggingFace Models Page¶
App Page --> Unlikely to purchase¶
App Page --> Likely to purchase¶
GITHUB FOLDER STRUCTURE¶
GITHUB PIPELINE EXECUTION¶
GITHUB ACTIONS TAB¶
BUSINESS RECOMMENDATION AND CONCLUSION¶
The models were built using the Decision Tree, Random Forest, and Gradient Boosting classifiers.
Of these, the Gradient Boosting Classifier gave the highest metrics in training:
- Accuracy: 91%
- Precision: 76%
- Recall: 83%
- F1-Score: 79%
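As a quick sanity check, the reported F1-score follows from the precision and recall figures, since F1 is their harmonic mean:

```python
# F1 = 2PR / (P + R); with the reported P = 76% and R = 83%
precision, recall = 0.76, 0.83
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # → 0.79, matching the reported 79%
```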
Business Recommendation: Based on the Gradient Boosting Classifier's performance and the tourism-prediction context, here are actionable business recommendations for VisitWithUs to optimize customer targeting and increase package sales:
- Prioritize High-Potential Customers:
Use the Gradient Boosting Classifier to identify customers with a high likelihood of purchasing tourism packages. The model's high recall (83%) ensures that most potential buyers are captured, reducing missed opportunities.
Focus marketing efforts (e.g., personalized emails, discounts, or tailored promotions) on customers predicted as positive by the model, especially those above the best threshold (0.7076, as identified in the training logs).
- Optimize Marketing Resources:
The model's precision of 76.61% indicates that 76.61% of predicted positive cases are correct, helping to minimize resources wasted on unlikely buyers. Allocate budget to high-probability leads to improve ROI.
Use feature importances from the Gradient Boosting model to understand the key drivers of purchase decisions (e.g., Age, MonthlyIncome, NumberOfTrips), and tailor campaigns to emphasize what resonates with high-value segments (for example, luxury packages for high-income customers).
- Enhance Customer Engagement:
- For customers with lower prediction probabilities, develop nurturing campaigns to convert them over time.
- Leverage the model's insights to segment customers by demographics or behaviour (e.g., TypeOfContact, Occupation, Designation) for personalized engagement strategies.
- Monitor and Refine Model Performance:
- Continuously track the model's performance in production using metrics such as F1-Score and accuracy, and update the model with new customer data to maintain its predictive power.
- Streamline Operations with Automation:
- Integrate the deployed Streamlit app into the company's CRM system to provide real-time predictions for sales teams, enabling quick decision-making during customer interactions.
- Automate follow-up processes for high-probability leads using the app's output, reducing manual effort and improving efficiency.
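The targeting rule in the first recommendation can be sketched as follows. The model, features, and data here are illustrative stand-ins, not the pipeline's actual artifacts; only the 0.7076 threshold comes from the project's training logs:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))             # stand-ins for Age, MonthlyIncome, ...
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # stand-in purchase labels

model = GradientBoostingClassifier(random_state=42).fit(X, y)
proba = model.predict_proba(X)[:, 1]      # purchase probability per customer

BEST_THRESHOLD = 0.7076                   # threshold selected during training
high_potential = proba >= BEST_THRESHOLD  # prioritize these for outreach

print(f"{int(high_potential.sum())} of {len(X)} customers flagged for the campaign")
```

In production, the same thresholding would run on the deployed model's `predict_proba` output to produce the call list for the marketing team.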
from google.colab import drive
drive.mount('/content/drive/')
%cd '/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master/'
Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).
/content/drive/MyDrive/PGP_AI_ML_GREAT_LEARNING/10_Advance_Machine_Learning_And_MLOps/Final_Project/VisitWithUs-Tourism_version_1_1/Master
!ls
BuildingModels.py  Data  DataPrepration.py  DataRegistration.py  Deployment  HostingInHuggingFace.py  main.py  Model_Dump_JOBLIB  __pycache__  README.md  Visit-With-Us-Tourism-Prediction_v1_1.ipynb
!pip install nbconvert
Requirement already satisfied: nbconvert in /usr/local/lib/python3.12/dist-packages (7.16.6)
!jupyter nbconvert --to html Visit-With-Us-Tourism-Prediction_v1_1.ipynb
[NbConvertApp] Converting notebook Visit-With-Us-Tourism-Prediction_v1_1.ipynb to html [NbConvertApp] Writing 4837072 bytes to Visit-With-Us-Tourism-Prediction_v1_1.html